Documentation Home Give Thanks / Docs & Website / Email Support

XML: Services Lists

This documentation outlines the principles and format behind the service lists that we keep for myChannels.opml, myStrips.opml and myWebCams.opml. As per the myServices document, the format closely resembles Dave Winer's Userland format, but OCS has been taken into consideration as well.

There are three kinds of services lists available for each type of service:

  • The complete list contains every service that is still operational - we don't check for validity, we don't check for freshness, we don't check for content - we only check that it's there. For the channels service list, this file would be called services-channels-complete.xml.
  • The recent list is a pared-down version of the complete list, and contains only services updated in the past month. For the channels service list, this file would be called services-channels-recent.xml.
  • Finally, the failure list contains services that returned a 404, had an unreachable host, had internal server errors, or otherwise did not return valid XML. For the channels service list, this file would be called services-channels-failure.xml. The failure list is randomly checked for revived listings, which are added back to the complete list.
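The naming pattern above can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and only the "channels" file names are confirmed by this document, so names for other service types are an assumption that they follow the same pattern.

```python
# The three kinds of list described above.
LIST_KINDS = ("complete", "recent", "failure")

def service_list_filenames(service_type):
    """Map each list kind to its file name for a service type such as 'channels'.

    Hypothetical helper: only the 'channels' names appear in this document.
    """
    return {kind: f"services-{service_type}-{kind}.xml" for kind in LIST_KINDS}

print(service_list_filenames("channels")["recent"])
```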

A complete (yet rather minimal) service list, containing a single service, is shown below.

<?xml version="1.0"?>
<servicelist>
  <header>
    <docs>http://www.disobey.com/amphetadesk/xml_services_lists.htm</docs>
    <entries>1255</entries>
    <updated>Thu, 15 Mar 2002 00:10:39 GMT</updated>
    <version>1</version>
  </header>
  <services>
    <service>
      <added>Thu, 15 Mar 2002 14:28:19 GMT</added>
      <description>The best little example description around!</description>
      <error>No headers downloaded.</error>
      <htmlurl>http://www.superlugnuts.com/example.html</htmlurl>
      <id>f3cc9706db585cd9b776ae143897106b</id>
      <imageurl>http://www.superlugnuts.com/image.jpg</imageurl>
      <language>en</language>
      <lastchecked>Mon, 05 Mar 2002 14:32:37 GMT</lastchecked>
      <lastmodified>Thu, 08 Jun 2000 13:40:05 GMT</lastmodified>
      <timeschecked>2</timeschecked>
      <title>Super LugNuts and Happy Examples</title>
      <xmlurl>http://www.superlugnuts.com/example.xml</xmlurl>
    </service>
  </services>
</servicelist>

You can see how this would show up if a user subscribed to it by checking out the myServices documentation.
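If you'd rather poke at a service list from a script, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name and the trimmed sample are illustrative assumptions, not part of any AmphetaDesk code.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the example above; only a few elements are kept.
SAMPLE = """<?xml version="1.0"?>
<servicelist>
  <header>
    <entries>1255</entries>
    <version>1</version>
  </header>
  <services>
    <service>
      <id>f3cc9706db585cd9b776ae143897106b</id>
      <title>Super LugNuts and Happy Examples</title>
      <xmlurl>http://www.superlugnuts.com/example.xml</xmlurl>
    </service>
  </services>
</servicelist>
"""

def read_service_list(xml_text):
    """Return (header, services) as plain dictionaries keyed by element name."""
    root = ET.fromstring(xml_text)
    header = {child.tag: child.text for child in root.find("header")}
    services = [
        {child.tag: child.text for child in service}
        for service in root.iter("service")
    ]
    return header, services

header, services = read_service_list(SAMPLE)
print(header["version"], services[0]["title"])
```

Every value comes back as a string; converting dates and counts is left to the caller.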

<added>

Allowed Within: <service>
Frequency: Once
Required: Yes

<added> contains the date the service was added, in GMT.

<description>

Allowed Within: <service>
Frequency: Once
Required: No

The <description> element is determined by the actual service and can be a string of any length. Most of the time, the service author will put a little blurb about the service - other authors will put the last modified date of the service.

<docs>

Allowed Within: <header>
Frequency: Once
Required: Yes

<docs> points to a valid URI explaining what all this means.

<entries>

Allowed Within: <header>
Frequency: Once
Required: Yes

<entries> records the total number of services within the list.
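A consumer might want to confirm that the declared count matches what is actually in the file. The sketch below is one way to do that in Python; the function name is hypothetical, and it assumes <entries> is present (it is required) and holds a plain integer.

```python
import xml.etree.ElementTree as ET

def entries_mismatch(xml_text):
    """Return (declared, actual) if <entries> disagrees with the number of
    <service> elements, or None when the two agree."""
    root = ET.fromstring(xml_text)
    declared = int(root.findtext("header/entries"))
    actual = len(root.findall("services/service"))
    return None if declared == actual else (declared, actual)
```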

<error>

Allowed Within: <service>
Frequency: Once
Required: No

<error> represents a short, simple, generic message (or number, see below) for why the service is considered a failure. Current applicable values are "No headers downloaded" (common when the service host has timed out on responding, or when the service host is unavailable) and "Error parsing XML" (common due to custom 404 pages, redirects to home pages, or when the data isn't valid XML).

If <error> shows up within our complete or recent service lists, then it represents the number of consecutive times the service has errored. After three consecutive errors, the service is automatically added to the failure list.
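The three-strikes rule can be sketched as follows. This is not the actual checker: the names are hypothetical, and resetting the counter after a successful check is an assumption drawn from the word "consecutive".

```python
FAILURE_THRESHOLD = 3  # consecutive errors before a service is moved

def record_check(consecutive_errors, check_failed):
    """Return (new_error_count, move_to_failure_list) after one check.

    Assumption: a successful check resets the counter to zero.
    """
    if not check_failed:
        return 0, False
    count = consecutive_errors + 1
    return count, count >= FAILURE_THRESHOLD

# Two failures, one success, then three failures in a row.
count = 0
for failed in (True, True, False, True, True, True):
    count, move = record_check(count, failed)
print(count, move)
```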

<header>

Allowed Within: <servicelist>
Frequency: Once
Required: Yes

The <header> block holds housekeeping and administrative information, such as <docs> (where to go for more information), <entries> (how many services are in the list), <updated> (when the service list was updated), and <version> (which revision of the format this service list uses).

<htmlurl>

Allowed Within: <service>
Frequency: Once
Required: No
See Also: <imageurl>, <xmlurl>

As the name suggests, the <htmlurl> element contains the full URI to the webpage of the service in question.

<id>

Allowed Within: <service>
Frequency: Once
Required: Yes

<id> is a unique identifier for the service: a 32-character MD5 hash of the xmlurl. The xmlurl is just a seed and shouldn't be given any importance - in myStrips.opml and myWebCams.opml, you'd use the imageurl instead.
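Generating an identifier of this shape takes one call to Python's hashlib. Treat this as a sketch: the function name is made up, and encoding the URL as UTF-8 with no normalization is an assumption, so the result is not guaranteed to match the ids in the published lists.

```python
import hashlib

def service_id(seed_url):
    """Return a 32-character hexadecimal MD5 digest of the seed URL.

    Assumption: the URL is hashed as UTF-8 bytes, exactly as given.
    """
    return hashlib.md5(seed_url.encode("utf-8")).hexdigest()

print(len(service_id("http://www.superlugnuts.com/example.xml")))  # 32
```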

<imageurl>

Allowed Within: <service>
Frequency: Once
Required: Yes (myWebCams.opml, myStrips.opml); No (myChannels.opml)
See Also: <htmlurl>, <xmlurl>

Although <imageurl> sounds easy enough, there is a bit of confusion concerning what exactly it means in different contexts. In myChannels.opml, it means nothing, and you'll never see it. If, on the other hand, you're messing with myStrips.opml and myWebCams.opml, the <imageurl> points to the URI of the comic strip or webcam image in question.

<language>

Allowed Within: <service>
Frequency: Once
Required: No

<language> holds the dialect that the service is published in. This is currently ignored, but that may change in the future (perhaps a filter by language? auto creation of urls for net translators? any other suggestions?). Examples of valid languages: "en", "en-us", "fr", etc.

<lastchecked>

Allowed Within: <service>
Frequency: Once
Required: No
See Also: <timeschecked>

<lastchecked> tells us when the service entry was last checked for information - whether it be for existence, title and description, or just for the last modification date.

<lastmodified>

Allowed Within: <service>
Frequency: Once
Required: No

If the webserver reports a "Last-Modified:" header in its response for the service, we insert that value in this element. Typically, the results are in GMT. Not all servers send the "Last-Modified:" header, so we can't rely on it as an adequate measure of service age.
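Because the value may be missing or malformed, code that reads it should be forgiving. Here is a small sketch using email.utils.parsedate_to_datetime from the Python standard library; the wrapper name is hypothetical, and returning None for unusable values is just one reasonable choice.

```python
from email.utils import parsedate_to_datetime

def parse_http_date(text):
    """Parse a date such as 'Thu, 08 Jun 2000 13:40:05 GMT'.

    Returns a datetime, or None when the value is missing or unparseable.
    """
    if not text:
        return None
    try:
        return parsedate_to_datetime(text)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None

print(parse_http_date("Thu, 08 Jun 2000 13:40:05 GMT"))
```

The same helper works for <added>, <lastchecked> and <updated>, which use the same date format in the example.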

<service>

Allowed Within: <services>
Frequency: Multiple
Required: Yes

The <service> tag contains one service definition. There can be an unlimited number of services in a single file.

<timeschecked>

Allowed Within: <service>
Frequency: Once
Required: No
See Also: <lastchecked>

Where <lastchecked> tells us when we last looked at the service for data, <timeschecked> counts how many times we have looked at the service. This can be used to make determinations on failed services, or just as a measure of longevity in the service list.

<title>

Allowed Within: <service>
Frequency: Once
Required: Yes

<title> holds the title or name of the service in question and can be any length.

<updated>

Allowed Within: <header>
Frequency: Once
Required: Yes

<updated> contains the last time the list was updated, in GMT.

<xmlurl>

Allowed Within: <service>
Frequency: Once
Required: Yes (myChannels.opml); No (myWebCams.opml, myStrips.opml)
See Also: <htmlurl>, <imageurl>

<xmlurl> is the full URI to the document that contains the channel xml. It's ignored for myStrips.opml and myWebCams.opml.
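Since the required elements differ by list type, a validator has to know which list it is reading. The sketch below pulls the "Required" lines of this document into one lookup; the keys "channels", "strips" and "webcams" are hypothetical labels for myChannels.opml, myStrips.opml and myWebCams.opml, and each service is assumed to be a plain dictionary of element names to text.

```python
# Elements marked "Required: Yes" for every <service>.
ALWAYS_REQUIRED = {"added", "id", "title"}

# Elements required only for particular list types.
REQUIRED_BY_LIST = {
    "channels": {"xmlurl"},
    "strips": {"imageurl"},
    "webcams": {"imageurl"},
}

def missing_elements(service, list_type):
    """Return a sorted list of required elements that are absent or empty."""
    required = ALWAYS_REQUIRED | REQUIRED_BY_LIST[list_type]
    return sorted(name for name in required if not service.get(name))

channel = {
    "added": "Thu, 15 Mar 2002 14:28:19 GMT",
    "id": "f3cc9706db585cd9b776ae143897106b",
    "title": "Super LugNuts and Happy Examples",
    "xmlurl": "http://www.superlugnuts.com/example.xml",
}
print(missing_elements(channel, "channels"))
print(missing_elements(channel, "strips"))
```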

<version>

Allowed Within: <header>
Frequency: Once
Required: Yes

The <version> tag allows us to track different variations of the service list in question - the current version is "1".



Any questions about the above? Email morbus@disobey.com.
This footer was last updated 05/25/01; odds are the whole document was too.

 
